REASONS FOR REVIEW



Claims 1-30 have been rejected under 35 U.S.C. §102(e) as anticipated by U.S.
Patent No. 6,535,871 (Romansky). The present invention relates to a method and
apparatus that allows the contents of secure documents to be indexed by conventional 
search engines without making the plain text available. In particular, a text stream 
derived from the entire document content is broken into phrases of two to five words,
the phrases are then randomized, and a text file is created from the randomized stream.
This process produces a scrambled text file that cannot be read by humans but that
contains nearly all of the words in the original document and most of the phrases. In
particular, the word frequency and word context are largely retained. Third-party search
engines are allowed to index the scrambled file so that search algorithms that search on 
particular words or phrases produce nearly the same number of hits as with the plain 
text file. 
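
The fragment-and-randomize process described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names `fragment` and `scramble`, the whitespace tokenization, and the `seed` parameter are assumptions made only for illustration. Unlike the process described above, which retains nearly all words, this sketch happens to retain every word.

```python
import random

def fragment(words, rng, min_len=2, max_len=5):
    """Split a word list into consecutive phrases of min_len..max_len words.

    The final phrase may be shorter than min_len if the stream runs out.
    """
    phrases = []
    i = 0
    while i < len(words):
        n = rng.randint(min_len, max_len)  # random phrase length, inclusive bounds
        phrases.append(words[i:i + n])
        i += n
    return phrases

def scramble(text, seed=None):
    """Return the text with its phrases in random order (hypothetical helper)."""
    rng = random.Random(seed)
    phrases = fragment(text.split(), rng)  # assumption: whitespace tokenization
    rng.shuffle(phrases)                   # randomize phrase order only
    return " ".join(" ".join(p) for p in phrases)
```

Because only the order of the phrases changes, word order inside each phrase is untouched. In this sketch, word frequencies are therefore preserved exactly, and a phrase query fails only where the phrase happened to straddle a fragment boundary.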

The Romansky reference discloses a method and apparatus for generating a 
plain text index from a secure document. As disclosed, the index is generated by 
extracting single keywords from the plain text of the secure document, eliminating 
"problematic" keywords, such as names, from the resulting keyword list to produce a 
reduced index and then scrambling the keywords in the reduced index. This is set forth 
in Romansky, column 2, line 66 - column 3, line 34, in relation to Figure 1. Since the
Romansky index generating method uses single keywords instead of multi-word
phrases, the method loses most of the word context. This is noted in Romansky at
column 1, line 66 - column 2, line 1. Thus, with the Romansky scrambled word index,
search algorithms that search on particular words or phrases will generally not produce
nearly the same number of hits as with the plain text file.
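
For contrast, the single-keyword indexing that this brief attributes to Romansky might be sketched as follows. This is only an illustration of the characterization given here, not code taken from the reference: the function name `keyword_index`, the `problematic` set, the lowercasing step, and the `seed` parameter are all assumptions.

```python
import random
import string

def keyword_index(text, problematic, seed=None):
    """Build a scrambled list of unique single-word tokens (hypothetical helper)."""
    tokens = []
    seen = set()
    for raw in text.split():
        # Strip surrounding punctuation; lowercasing is an assumption of this sketch.
        tok = raw.strip(string.punctuation).lower()
        if tok and tok not in seen and tok not in problematic:
            seen.add(tok)        # duplicates are dropped
            tokens.append(tok)
    random.Random(seed).shuffle(tokens)  # scramble individual words
    return tokens
```

Each token in such an index stands alone, and duplicates are removed, so neither the original word adjacency nor the word frequency survives the scrambling.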

These differences are recited in the claims. Claim 1 is illustrative. It recites, in 
lines 5-6, "...fragmenting the text stream into multi-word phrases" and "randomly 
assembling the phrases into a scrambled document..." As discussed above, Romansky 
generates a keyword list and then scrambles the words on that list. The examiner 
points to Romansky, column 2, lines 11-32 and column 3, lines 6-17 as disclosing
fragmenting a text stream into multi-word phrases. However, at these locations,
Romansky is clearly discussing words "out of context" or "tokens" (words with leading
and trailing spaces removed, and with punctuation and duplicates removed). There is 
no discussion of using multi-word phrases. The examiner also argues that because the 
dictionary definition of the word "phrase" includes one word or a plurality of words, the 
term "multi-word phrase" could read on a single word. While Applicant could agree with 
the examiner if the term in question were just "phrase", it would appear that the 
adjective "multi-word" excludes a single word. In any case, claim 2 recites that a
"multi-word phrase" has two or more words, thereby explicitly excluding a single word.

Similarly, Romansky scrambles words, not phrases. The examiner points to
Romansky, column 2, lines 33-40 and column 3, lines 24-35, as disclosing this step.
However, at these locations, it is clear that Romansky mentions phrases in connection
with methods for ensuring that they do not occur in the output, in order to prevent
information that might be contained in the phrases from appearing in the scrambled
index. Romansky, column 3, lines 24-26, states that the tokens are randomized. The
examiner further argues that, because Romansky uses words in the plural, discusses
combinations of words in which the relation of the words is significant and has the ability
to search and conceal multi-word phrases, the step of randomly assembling the phrases
into a scrambled output document recited in claim 1 somehow reads on the
Romansky disclosure. However, as set forth above, Romansky does all of the things
mentioned by the examiner in order to prevent multi-word phrases from appearing in the
output. Therefore, the recited step is directly against the teaching of Romansky. Claim
1 clearly recites steps not disclosed in Romansky and thereby distinguishes over the
Romansky reference.

Claims 2-10 are dependent on claim 1 and include the recited steps. Therefore,
they also distinguish over the cited Romansky reference in the same manner as claim 1. In
addition, these claims recite additional limitations not disclosed in the Romansky
reference. For example, claims 2-4 recite that the multi-word phrases contain at least
two words, a random number of words, and a maximum of five words, respectively. In
Romansky, all word tokens consist of a single word. Thus, claims 2-4 distinguish over
the cited reference for this reason also. In addition, claim 5 recites that the position of
word phrases is swapped in the text stream. In Romansky, word positions are
swapped. Thus, claim 5 also distinguishes over the cited reference in the same manner
as claim 1.

Claim 11 is an apparatus claim that contains limitations that parallel those in 
claim 1. Therefore, it distinguishes over the cited Romansky reference in the same
manner as claim 1. Claims 12-20 are dependent on claim 11 and include the recited
limitations. Therefore, they also distinguish over the cited Romansky reference in the same
manner as claim 11. In addition, these claims recite additional limitations not disclosed
in the Romansky reference. For example, claims 12-15 recite limitations that parallel 
those recited in claims 2-5 and distinguish over the cited reference in a manner similar 
to that discussed above. 

Claim 21 is a computer program product claim that contains limitations that 
parallel those in claims 1 and 11. Therefore, it distinguishes over the cited Romansky
reference in the same manner as claims 1 and 11. Claims 22-30 are dependent on
claim 21 and include the recited limitations. Therefore, they also distinguish over the
cited Romansky reference in the same manner as claim 21. In addition, these claims recite
additional limitations not disclosed in the Romansky reference. For example, claims
22-25 recite limitations that parallel those recited in claims 2-5 and distinguish over the
cited reference in a manner similar to that discussed above. 